    Lensless Imaging by Compressive Sensing

    In this paper, we propose a lensless compressive imaging architecture. The architecture consists of two components, an aperture assembly and a sensor; no lens is used. The aperture assembly consists of a two-dimensional array of aperture elements, and the transmittance of each aperture element is independently controllable. The sensor is a single detection element. A compressive sensing matrix is implemented by adjusting the transmittance of the individual aperture elements according to the values of the sensing matrix. The proposed architecture is simple and reliable because no lens is used. The architecture can be used to capture images in the visible and other spectra, such as infrared or millimeter waves, and in surveillance applications for detecting anomalies or extracting features such as the speed of moving objects. Multiple sensors may be used with a single aperture assembly to capture multi-view images simultaneously. A prototype was built using an LCD panel and a photoelectric sensor to capture images in the visible spectrum.
    Comment: Accepted ICIP 2013. 5 pages, 7 figures. arXiv admin note: substantial text overlap with arXiv:1302.178
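    The measurement model here is simple to simulate: each programmable transmittance pattern is one row of the sensing matrix, and the single sensor records one inner product between that pattern and the scene. Below is a minimal NumPy sketch of that model; the 0/1 patterns, the DCT sparsity prior, and the generic ISTA solver are illustrative assumptions, since the abstract does not specify a reconstruction algorithm.

```python
# Minimal simulation of the lensless measurement model (illustrative
# assumptions: 0/1 aperture patterns, DCT sparsity prior, generic ISTA).
import numpy as np

rng = np.random.default_rng(0)
n_side = 16                       # tiny scene for illustration
n = n_side * n_side               # number of aperture elements / pixels
m = n // 3                        # number of single-sensor measurements

# Sensing matrix: each row is one programmable 0/1 transmittance pattern.
A = rng.integers(0, 2, size=(m, n)).astype(float)

def dct_matrix(k):
    """Orthonormal 1-D DCT-II matrix."""
    i = np.arange(k)
    D = np.cos(np.pi * (2 * i[None, :] + 1) * i[:, None] / (2 * k))
    D[0] /= np.sqrt(2)
    return D * np.sqrt(2.0 / k)

D1 = dct_matrix(n_side)
Psi = np.kron(D1, D1)             # 2-D DCT: image = Psi.T @ coefficients

# Synthetic scene that is sparse in the DCT basis.
coeffs = np.zeros(n)
coeffs[rng.choice(n, 10, replace=False)] = rng.normal(size=10)
x_true = Psi.T @ coeffs
y = A @ x_true                    # one sensor reading per aperture pattern

# ISTA: minimize 0.5*||y - A Psi.T c||^2 + lam*||c||_1 over coefficients c.
B = A @ Psi.T
step = 1.0 / np.linalg.norm(B, 2) ** 2
lam, c = 1e-3, np.zeros(n)
for _ in range(500):
    c = c - step * (B.T @ (B @ c - y))                        # gradient step
    c = np.sign(c) * np.maximum(np.abs(c) - step * lam, 0.0)  # soft threshold

x_hat = Psi.T @ c
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```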

    Iterative Object and Part Transfer for Fine-Grained Recognition

    The aim of fine-grained recognition is to identify subordinate categories in images, such as different species of birds. Existing work has confirmed that, in order to capture the subtle differences across categories, automatic localization of objects and parts is critical. Most approaches to object and part localization rely on a bottom-up pipeline, where thousands of region proposals are generated and then filtered by pre-trained object/part models. This is computationally expensive and does not scale once the number of objects/parts becomes large. In this paper, we propose a nonparametric, data-driven method for object and part localization. Given an unlabeled test image, our approach transfers annotations from a few similar images retrieved from the training set. In particular, we propose an iterative transfer strategy that gradually refines the predicted bounding boxes. Based on the located objects and parts, deep convolutional features are extracted for recognition. We evaluate our approach on the widely used CUB200-2011 dataset and a new, large dataset called Birdsnap. On both datasets, we achieve better results than many state-of-the-art approaches, including a few that use oracle (manually annotated) bounding boxes in the test images.
    Comment: To appear in ICME 2017 as an oral paper
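    The transfer loop itself is easy to sketch: retrieve the K training images nearest to the test image in feature space, average their annotated boxes onto the test image, re-extract features inside the current box, and repeat. In the Python sketch below, random vectors stand in for deep CNN features, and `extract_feature`, K, and the iteration count are hypothetical stand-ins rather than the paper's exact components.

```python
# Sketch of iterative annotation transfer (hypothetical stand-ins: random
# vectors instead of CNN features, a toy extract_feature, K = 5 neighbors).
import numpy as np

rng = np.random.default_rng(1)
N, D, K = 500, 64, 5              # training images, feature dim, neighbors

train_feats = rng.normal(size=(N, D))
train_feats /= np.linalg.norm(train_feats, axis=1, keepdims=True)

# Ground-truth object boxes in normalized (x1, y1, x2, y2) coordinates.
lo = rng.uniform(0.0, 0.5, size=(N, 2))
hi = np.minimum(lo + rng.uniform(0.2, 0.5, size=(N, 2)), 1.0)
train_boxes = np.concatenate([lo, hi], axis=1)

def extract_feature(image_feat, box):
    """Stand-in for CNN features pooled inside `box`; a real system would
    re-run the network on the cropped region instead of this perturbation."""
    f = image_feat + 0.05 * np.resize(box, image_feat.shape)
    return f / np.linalg.norm(f)

test_feat = rng.normal(size=D)
test_feat /= np.linalg.norm(test_feat)

box = np.array([0.0, 0.0, 1.0, 1.0])     # start from the whole image
for it in range(3):                      # iterative refinement
    feat = extract_feature(test_feat, box)
    sims = train_feats @ feat            # cosine similarity (unit vectors)
    nn = np.argsort(-sims)[:K]           # K most similar training images
    box = train_boxes[nn].mean(axis=0)   # transfer: average neighbors' boxes
    print(f"iter {it}: box = {np.round(box, 3)}")
```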

    Learning Fashion Compatibility with Bidirectional LSTMs

    The ubiquity of online fashion shopping demands effective recommendation services for customers. In this paper, we study two types of fashion recommendation: (i) suggesting an item that matches existing components in a set to form a stylish outfit (a collection of fashion items), and (ii) generating an outfit from multimodal (image/text) specifications given by a user. To this end, we propose to jointly learn a visual-semantic embedding and the compatibility relationships among fashion items in an end-to-end fashion. More specifically, we consider a fashion outfit to be a sequence (usually from top to bottom and then accessories) and each item in the outfit a time step. Given the fashion items in an outfit, we train a bidirectional LSTM (Bi-LSTM) model to sequentially predict the next item conditioned on the previous ones, thereby learning their compatibility relationships. Furthermore, we learn a visual-semantic space by regressing image features to their semantic representations, aiming to inject attribute and category information as a regularization for training the LSTM. The trained network can not only perform the aforementioned recommendations effectively but also predict the compatibility of a given outfit. We conduct extensive experiments on our newly collected Polyvore dataset, and the results provide strong qualitative and quantitative evidence that our framework outperforms alternative methods.
    Comment: ACM MM 1
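    A compact way to see the two training signals is a forward/backward next-item prediction loss on the Bi-LSTM states plus a regression from image features into the semantic space. The PyTorch sketch below follows that structure; the dimensions, the MSE objectives (the abstract does not fix the losses), and the random tensors standing in for per-item CNN/text features are illustrative assumptions, not the authors' implementation.

```python
# PyTorch sketch of the two training signals (illustrative dimensions and
# MSE objectives; random tensors stand in for per-item CNN/text features).
import torch
import torch.nn as nn

feat_dim, hid, sem_dim = 512, 256, 128
lstm = nn.LSTM(feat_dim, hid, batch_first=True, bidirectional=True)
fwd_head = nn.Linear(hid, feat_dim)  # forward state predicts the next item
bwd_head = nn.Linear(hid, feat_dim)  # backward state predicts the previous item
vse = nn.Linear(feat_dim, sem_dim)   # image feature -> visual-semantic space

params = (list(lstm.parameters()) + list(fwd_head.parameters())
          + list(bwd_head.parameters()) + list(vse.parameters()))
opt = torch.optim.Adam(params, lr=1e-3)

# One fake outfit: T items, each with an image feature and a text embedding.
T = 5
img = torch.randn(1, T, feat_dim)    # stands in for CNN features per item
txt = torch.randn(1, T, sem_dim)     # stands in for item text embeddings

out, _ = lstm(img)                   # (1, T, 2*hid)
h_fwd, h_bwd = out[..., :hid], out[..., hid:]

# Sequence losses: step t's forward state predicts item t+1, and step t's
# backward state predicts item t-1 (MSE is a simplification here).
loss_fwd = nn.functional.mse_loss(fwd_head(h_fwd[:, :-1]), img[:, 1:])
loss_bwd = nn.functional.mse_loss(bwd_head(h_bwd[:, 1:]), img[:, :-1])

# Visual-semantic regularizer: regress image features onto text embeddings.
loss_vse = nn.functional.mse_loss(vse(img), txt)

loss = loss_fwd + loss_bwd + loss_vse
opt.zero_grad()
loss.backward()
opt.step()
print(float(loss))
```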